Accelerated Greedy Mixture Learning
Abstract
Mixture probability densities are popular models that are used in several data mining and machine learning applications, e.g., clustering. A standard algorithm for learning such models from data is the Expectation-Maximization (EM) algorithm. However, EM can be slow with large datasets, and therefore approximation techniques are needed. In this paper we propose a variational approximation to the greedy EM algorithm which offers speedups that are at least linear in the number of data points. Moreover, by strictly increasing a lower bound on the data log-likelihood in every learning step, our algorithm guarantees convergence. We demonstrate the proposed algorithm on a synthetic experiment where satisfactory results are obtained.
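To make the cost and convergence claims concrete, the following is a minimal sketch of standard EM for a one-dimensional Gaussian mixture. It is not the variational greedy algorithm proposed in the paper: the function names, the synthetic data, and the initial parameters are all illustrative assumptions. It shows only the two properties the abstract relies on: each iteration touches every data point for every component, and the data log-likelihood does not decrease from one iteration to the next.

```python
import math
import random


def component_densities(x, weights, means, variances):
    """Weighted Gaussian density of each mixture component at point x."""
    return [
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(component_densities(x, weights, means, variances)))
        for x in data
    )


def em_step(data, weights, means, variances, min_var=1e-6):
    """One plain EM iteration; cost is O(n * k) for n points, k components."""
    n, k = len(data), len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        joint = component_densities(x, weights, means, variances)
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate weights, means, and variances from responsibilities.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mj = sum(r[j] * x for r, x in zip(resp, data)) / nj
        vj = sum(r[j] * (x - mj) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / n)
        new_m.append(mj)
        new_v.append(max(vj, min_var))  # assumed floor to avoid zero variance
    return new_w, new_m, new_v


# Hypothetical synthetic data: two well-separated Gaussian clusters.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(200)]
data += [rng.gauss(3.0, 1.0) for _ in range(200)]

params = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])  # assumed initial guess
history = [log_likelihood(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
```

Here `history` records the log-likelihood after each iteration, so comparing consecutive entries checks the monotonic behaviour. The per-point loop in the E-step is the cost that accelerated variants aim to avoid on large data sets.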
Related resources
Accelerated EM-based clustering of large data sets
Motivated by the poor performance (linear complexity) of the EM algorithm in clustering large data sets, and inspired by the successful accelerated versions of related algorithms like k-means, we derive an accelerated variant of the EM algorithm for Gaussian mixtures that: (1) offers speedups that are at least linear in the number of data points, (2) ensures convergence by strictly incr...
Gossip-Based Greedy Gaussian Mixture Learning
It has been recently demonstrated that the classical EM algorithm for learning Gaussian mixture models can be successfully implemented in a decentralized manner by resorting to gossip-based randomized distributed protocols. In this paper we describe a gossip-based implementation of an alternative algorithm for learning Gaussian mixtures in which components are added to the mixture one after ano...
Efficient Greedy Learning of Gaussian Mixture Models
This article concerns the greedy learning of Gaussian mixtures. In the greedy approach, mixture components are inserted into the mixture one after the other. We propose a heuristic for searching for the optimal component to insert. In a randomized manner, a set of candidate new components is generated. For each of these candidates, we find the locally optimal new component and insert it into th...
Publication date: 2004